AITopics | text promptable surgical instrument segmentation

Supplementary Material for Text Promptable Surgical Instrument Segmentation with Vision-Language Models

Zijian Zhou

Neural Information Processing Systems | Feb-12-2026, 06:38:17 GMT

They are used in our experiments section.

OpenAI GPT-4 based prompts. The input template for OpenAI GPT-4 is defined as: "Please describe the appearance of [class_name] in endoscopic surgery, and change the description to a phrase with subject, and not use colons."

The dataset consists of both training and test cases. Each video is recorded at 25 FPS and has annotations for instruments and operation phases.

For EndoVis2019, the results are shown in Tab. 1. Our method (input size 448) surpasses the competition's top performers, with a +3% increase in DSC and a +2% increase in NSD.
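As a minimal sketch of how the GPT-4 input template quoted above could be filled in per instrument class, the snippet below substitutes a class name for the `[class_name]` placeholder. The template string is taken from the excerpt; the names `GPT4_TEMPLATE` and `build_prompt` and the example instrument names are assumptions made for illustration, and the call to the language model itself is deliberately left out.

```python
# Template text is quoted from the excerpt; everything else here is hypothetical
# and only illustrates the placeholder substitution step.
GPT4_TEMPLATE = (
    "Please describe the appearance of [class_name] in endoscopic surgery, "
    "and change the description to a phrase with subject, and not use colons."
)


def build_prompt(class_name: str) -> str:
    """Fill the [class_name] placeholder with a concrete instrument name."""
    name = class_name.strip()
    if not name:
        raise ValueError("class_name must be non-empty")
    return GPT4_TEMPLATE.replace("[class_name]", name)


if __name__ == "__main__":
    # Illustrative instrument names, not the class list of any specific dataset.
    for instrument in ["bipolar forceps", "needle driver"]:
        print(build_prompt(instrument))
```

Each resulting string would then be sent to the language model once per class, and the returned description used as a text prompt for that instrument.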

large language model, machine learning, natural language, (18 more...)

Genre: Research Report > New Finding (0.49)

Industry: Health & Medicine > Surgery (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.46)

Text Promptable Surgical Instrument Segmentation with Vision-Language Models

Neural Information Processing Systems | Dec-25-2025, 11:37:22 GMT

In this paper, we propose a novel text promptable surgical instrument segmentation approach to overcome challenges associated with diversity and differentiation of surgical instruments in minimally invasive surgeries. We redefine the task as text promptable, thereby enabling a more nuanced comprehension of surgical instruments and adaptability to new instrument types. Inspired by recent advancements in vision-language models, we leverage pretrained image and text encoders as our model backbone and design a text promptable mask decoder consisting of attention- and convolution-based prompting schemes for surgical instrument segmentation prediction. Our model leverages multiple text prompts for each surgical instrument through a new mixture of prompts mechanism, resulting in enhanced segmentation performance. Additionally, we introduce a hard instrument area reinforcement module to improve image feature comprehension and segmentation precision. Extensive experiments on several surgical instrument segmentation datasets demonstrate our model's superior performance and promising generalization capability. To our knowledge, this is the first implementation of a promptable approach to surgical instrument segmentation, offering significant potential for practical application in the field of robotic-assisted surgery.
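The abstract does not say how the multiple text prompts for one instrument are combined. As a rough illustration only, the sketch below assumes each prompt yields its own per-pixel score map and that the maps are fused with softmax-normalized gating weights. The function names, the gating logits, and the weighted-average rule are all assumptions, not the paper's stated mixture of prompts mechanism.

```python
import math
from typing import List, Sequence

ScoreMap = List[List[float]]  # H x W grid of per-pixel scores for one prompt


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a short list of gating logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def mix_prompt_maps(score_maps: Sequence[ScoreMap],
                    gate_logits: Sequence[float]) -> ScoreMap:
    """Fuse per-prompt score maps by a softmax-weighted average (assumed rule)."""
    if not score_maps or len(score_maps) != len(gate_logits):
        raise ValueError("need one gating logit per score map")
    weights = softmax(gate_logits)
    height, width = len(score_maps[0]), len(score_maps[0][0])
    mixed = [[0.0] * width for _ in range(height)]
    for weight, score_map in zip(weights, score_maps):
        for r in range(height):
            for c in range(width):
                mixed[r][c] += weight * score_map[r][c]
    return mixed


if __name__ == "__main__":
    # Two toy 1x2 score maps, as if produced from two prompts for one instrument.
    maps = [[[1.0, 0.0]], [[0.0, 1.0]]]
    print(mix_prompt_maps(maps, gate_logits=[0.0, 0.0]))
```

With equal gating logits the fused map is the plain average of the per-prompt maps; unequal logits shift the result toward the prompts the gate favors.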

name change, text promptable surgical instrument segmentation, vision-language model, (2 more...)

Industry:

Health & Medicine > Surgery (1.00)
Health & Medicine > Health Care Technology (1.00)

Technology: Information Technology > Artificial Intelligence (1.00)

Supplementary Material for Text Promptable Surgical Instrument Segmentation with Vision-Language Models

Zijian Zhou

Neural Information Processing Systems | Oct-8-2025, 18:15:31 GMT

large language model, machine learning, natural language, (18 more...)

Genre: Research Report > New Finding (0.49)

Industry: Health & Medicine > Surgery (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.46)

Text Promptable Surgical Instrument Segmentation with Vision-Language Models

Neural Information Processing Systems | Jan-18-2025, 14:40:15 GMT

In this paper, we propose a novel text promptable surgical instrument segmentation approach to overcome challenges associated with diversity and differentiation of surgical instruments in minimally invasive surgeries. We redefine the task as text promptable, thereby enabling a more nuanced comprehension of surgical instruments and adaptability to new instrument types. Inspired by recent advancements in vision-language models, we leverage pretrained image and text encoders as our model backbone and design a text promptable mask decoder consisting of attention- and convolution-based prompting schemes for surgical instrument segmentation prediction. Our model leverages multiple text prompts for each surgical instrument through a new mixture of prompts mechanism, resulting in enhanced segmentation performance. Additionally, we introduce a hard instrument area reinforcement module to improve image feature comprehension and segmentation precision.

surgery, text promptable surgical instrument segmentation, vision-language model

Industry:

Health & Medicine > Surgery (1.00)
Health & Medicine > Health Care Technology (1.00)

Technology:

Information Technology > Artificial Intelligence > Vision (0.65)
Information Technology > Artificial Intelligence > Natural Language (0.65)
